fix: memory not released after indexing (20GB+ RSS for 5MB data) #833
fxfxfx123 wants to merge 3 commits
- Set mi_option_arena_purge_mult=1 (default 10) so arenas are purged aggressively without extra delay
- Set mi_option_page_reclaim_on_free=1 to reclaim pages from exited worker thread heaps
- Lower DEFAULT_RAM_FRACTION from 0.5 to 0.25 to reduce memory budget

Signed-off-by: fxfxfx123 <93531292+fxfxfx123@users.noreply.github.com>
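To make the effect of the DEFAULT_RAM_FRACTION change concrete, here is a minimal sketch of the arithmetic, assuming the budget is simply physical RAM multiplied by the fraction. The helper name `ram_budget_bytes` and the two macros are hypothetical and are not taken from the project source; only the values 0.5 and 0.25 come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Fractions quoted in the commit message (old default and new default).
 * The macro names are illustrative, not the project's. */
#define OLD_RAM_FRACTION 0.5
#define NEW_RAM_FRACTION 0.25

/* Hypothetical helper: memory budget in bytes for a given fraction of
 * physical RAM. Assumes the budget is a plain product of the two. */
uint64_t ram_budget_bytes(uint64_t total_ram_bytes, double fraction) {
    assert(fraction > 0.0 && fraction <= 1.0);
    return (uint64_t)((double)total_ram_bytes * fraction);
}
```

Under that assumption, the 32GB machine from the report would get a 16GB budget at the old fraction and an 8GB budget at the new one.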
Memory scales linearly with worker count (each gets its own mimalloc arena + 8MB stack). Diminishing returns past 8 workers. On a 20-core CPU this reduces peak memory by up to 60% with negligible speed loss.

Signed-off-by: fxfxfx123 <93531292+fxfxfx123@users.noreply.github.com>
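The worker-count reasoning in this commit can be sketched as follows. This assumes the change caps the worker count at 8, which the message implies but does not state outright; `MAX_WORKERS` and the function names are hypothetical. The 8MB per-worker stack is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap, inferred from "diminishing returns past 8 workers". */
#define MAX_WORKERS 8
/* Per-worker stack size stated in the commit message (8MB). */
#define WORKER_STACK_BYTES ((size_t)8 * 1024 * 1024)

/* Number of workers to start: one per core, up to the assumed cap. */
int capped_worker_count(int cpu_cores) {
    assert(cpu_cores > 0);
    return cpu_cores < MAX_WORKERS ? cpu_cores : MAX_WORKERS;
}

/* Stack memory reserved by the workers alone (arenas not included). */
size_t worker_stack_bytes(int workers) {
    return (size_t)workers * WORKER_STACK_BYTES;
}

/* Percentage by which the cap reduces the worker count. */
int worker_reduction_percent(int cpu_cores) {
    int capped = capped_worker_count(cpu_cores);
    return (cpu_cores - capped) * 100 / cpu_cores;
}
```

On a 20-core machine this goes from 20 workers to 8, a 60% reduction in worker count. If per-worker memory really does scale linearly, that is consistent with the "up to 60%" peak-memory figure quoted above.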
Move inline trailing comments to preceding lines to match project style and satisfy clang-format-20.

Signed-off-by: fxfxfx123 <93531292+fxfxfx123@users.noreply.github.com>
Thanks for the Windows memory-retention fix for #832. Triage: high-priority stability/performance PR. Review will check the mimalloc options and worker-count change separately: we need post-index RSS to return to sane levels, but we also need to avoid an over-broad throughput regression or conflicting with the explicit memory-budget work in #752/#685. Please keep the before/after memory evidence current in the PR.
Quick status note: this PR is one of four open memory/RAM-policy changes (#833, #752, #586, #685) that we've reviewed individually and found genuinely complementary. Rather than merging them piecemeal, we're doing a combined design pass over the whole memory policy (explicit override, host-tiered defaults, retention bounds, post-index release, and the Windows auto-sync driver in #841) and will respond here with a concrete direction shortly. Your work is very much part of that plan. Thanks for your patience!
Thank you. Your diagnosis found the right lever. The core of the #832 fix is

Being straight with you: #832 isn't fully closed yet. The keystone also routes the two background index paths through a subprocess (so the kernel returns 100% of each cycle's RSS on exit), and #854 added budget-derived retention, so 'RSS won't come back' is fixed. But the trigger you hit on Windows (the watcher re-indexing on every poll even when nothing changed) is tracked separately as #841 and still to come. So I'll leave #832 open until that lands, but your page_reclaim finding is shipped. Really appreciate the sharp report. Closing this PR in favor of the folded fix.
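The subprocess approach mentioned in the comment above can be illustrated with a minimal POSIX sketch. It is not the project's implementation: `run_in_subprocess` and `sample_index_job` are hypothetical names, and a Windows build would need a process-creation API such as `CreateProcess` instead of `fork`. The point it shows is that memory allocated by the child is returned to the OS when the child exits, whatever the allocator retained internally.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `job` in a child process and return its exit code (0-255),
 * or -1 if the child could not be started or did not exit normally.
 * All memory the job allocates belongs to the child and is released
 * by the kernel when the child exits. */
int run_in_subprocess(int (*job)(void)) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: do the work, then exit without running parent atexit handlers. */
        _exit(job() & 0xff);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Stand-in for an indexing cycle: allocates and touches 16MB. */
int sample_index_job(void) {
    size_t n = (size_t)16 * 1024 * 1024;
    char *buf = malloc(n);
    if (buf == NULL) {
        return 1;
    }
    memset(buf, 1, n);
    free(buf);
    return 0;
}
```

Calling `run_in_subprocess(sample_index_job)` leaves the parent's RSS unaffected by the 16MB the job touched, at the cost of a process spawn per cycle.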
Problem
After indexing a small project (65 files, 1.3MB, 2509 nodes), codebase-memory-mcp retains 20GB+ RSS on a 32GB Windows machine. Memory grows monotonically and is never released to the OS.
Root Cause (from source code)
Fix
Benchmark
Tested on Windows 11, i7-12700, 32GB RAM, 65-file project.